The Life and Death of Discourse Entities: Identifying Singleton Mentions

Authors

  • Marta Recasens
  • Marie-Catherine de Marneffe
  • Christopher Potts
Abstract

A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing discourse entities that die out after just one mention (singletons) from those that lead longer lives (coreferent) would benefit NLP applications such as coreference resolution, protagonist identification, topic modeling, and discourse coherence. We build a logistic regression model for predicting the singleton/coreferent distinction, drawing on linguistic insights about how discourse entity lifespans are affected by syntactic and semantic features. The model is effective in its own right (78% accuracy), and incorporating it into a state-of-the-art coreference resolution system yields a significant improvement.
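As a rough illustration of the kind of classifier the abstract describes, the following minimal sketch fits a logistic regression model that separates singleton from coreferent mentions. The feature names, the toy training data, and the helper functions `train` and `classify` are all hypothetical and chosen only for illustration; they are not the authors' feature set, data, or implementation.

```python
import math

# Hypothetical binary features describing a mention. These names are
# illustrative assumptions, not the features used in the paper.
FEATURES = ["is_pronoun", "is_indefinite", "is_subject", "under_negation"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that a mention is coreferent (label 1) rather than a singleton (label 0)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(data, labels, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * len(data[0])
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(data, labels):
            err = predict_proba(weights, bias, x) - y  # derivative of log loss w.r.t. the score
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up toy data: one row per mention, columns follow FEATURES.
X = [
    [1, 0, 1, 0],  # pronoun in subject position
    [1, 0, 0, 0],  # pronoun
    [0, 0, 1, 0],  # non-pronominal subject
    [0, 1, 0, 0],  # indefinite noun phrase
    [0, 1, 0, 1],  # indefinite noun phrase under negation
    [0, 0, 0, 1],  # noun phrase under negation
]
y = [1, 1, 1, 0, 0, 0]  # 1 = coreferent, 0 = singleton

weights, bias = train(X, y)


def classify(x):
    """Label a mention using a 0.5 probability threshold."""
    return "coreferent" if predict_proba(weights, bias, x) >= 0.5 else "singleton"


for row in X:
    print(row, classify(row))
```

In practice one would likely use an established library implementation and real annotated data; the hand-written gradient descent here only keeps the sketch self-contained.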

Similar resources

Identification of singleton mentions in Russian

This paper describes a pilot study of the problem of detecting singleton mentions in Russian texts. A noun phrase is considered a singleton mention if it is the only referent of some entity. We discuss various morphosyntactic and lexical features, some of which were used for analogous tasks for English, and propose new features derived from the discourse analysis. Testing the machine l...

A Comparative Analysis of Self-Mentions in Applied Linguistics PhD Dissertations Written by Native and Non-Native English Writers

The purpose of the present study was to compare the PhD dissertations written by native and non-native English writers in the field of Applied Linguistics with regard to the use of self-mentions. To this end, 40 Applied Linguistics PhD dissertations (20 written by native English writers and 20 by non-native English writers) were randomly selected from academic texts written in 2007-2017. The p...

Entity-based Coreference Resolution combined with Discourse-New Detection

Anaphora and coreference resolution is a well-studied topic in NLP research, allowing a deeper understanding of the text than shallow methods by revealing discourse structures. Traditional systems reason over mentions, rather than entities, and perform clustering after the resolution process. In this work, general drawbacks with this approach are considered and related works employing knowledge...

Singleton Detection using Word Embeddings and Neural Networks

Singleton (or non-coreferential) mentions are a problem for coreference resolution systems, and identifying singletons before mentions are linked improves resolution performance. Here, a singleton detection system based on word embeddings and neural networks is presented, which achieves state-of-the-art performance (79.6% accuracy) on the CoNLL-2012 shared task development set. Extrinsic evaluat...

One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations

Discourse relations bind smaller linguistic units into coherent texts. Automatically identifying discourse relations is difficult, because it requires understanding the semantics of the linked arguments. A more subtle challenge is that it is not enough to represent the meaning of each argument of a discourse relation, because the relation may depend on links between lower-level components, such ...

Publication date: 2013